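Factor analysis is a widely used statistical method for exploring the relationships among many observed variables. It identifies the common factors underlying them and summarizes the variables with a smaller number of dimensions (factors) that explain their variation.
As an example, suppose we have survey data with several questions about personal health, such as weight, height, exercise frequency, and eating habits. These variables are likely to be related to one another: height and weight tend to be positively correlated, and eating habits are probably linked to overall health.
Factor analysis lets us identify the common factors behind them. A "physical health" factor might load on weight, height, and exercise frequency, while an "eating habits" factor might load on diet composition and fruit and vegetable intake. Combining the variables into fewer dimensions (the factors) makes the relationships among them easier to understand, and it also reduces the influence of measurement error in individual variables and the complexity of the analysis.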
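The idea above is usually written down as the common factor model. The notation below is standard textbook notation, not something taken from this post: $x$ is the vector of $p$ observed variables, $f$ the vector of $k < p$ common factors, $\Lambda$ the $p \times k$ loading matrix, and $\varepsilon$ the variable-specific (unique) part. The version shown assumes uncorrelated (orthogonal) factors.

```latex
x = \mu + \Lambda f + \varepsilon,
\qquad
\operatorname{Cov}(f) = I,\quad
\operatorname{Cov}(\varepsilon) = \Psi \ \text{(diagonal)},\quad
\operatorname{Cov}(f, \varepsilon) = 0
\;\Longrightarrow\;
\Sigma = \operatorname{Cov}(x) = \Lambda \Lambda^{\top} + \Psi,
\qquad
h_i^2 = \sum_{j=1}^{k} \lambda_{ij}^2
```

Here $h_i^2$ is the communality of variable $i$, the part of its variance explained by the common factors; for standardized variables the remainder $\psi_i = 1 - h_i^2$ is its uniqueness.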
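Factor analysis and principal component analysis (PCA) are both dimensionality-reduction methods. They look similar, but their goals and mechanics differ.
PCA is a linear transformation. Its goal is to project high-dimensional data into a lower-dimensional space while retaining as much of the original information as possible, by finding the few principal components that explain the most variance. PCA is generally applied to numeric data; it captures linear relationships among variables but cannot handle nonlinear ones.
Factor analysis, by contrast, focuses on the common factors shared by the variables in order to explain the relationships among them. It treats the observed variables as driven by a small number of latent factors and tries to recover those factors to explain the data. Because it models only the shared variance, it does not retain all of the information in the original data.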
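To make the PCA side of this comparison concrete, here is a minimal NumPy sketch of PCA done by eigendecomposing a correlation matrix. The data are synthetic and purely illustrative (four variables driven by one hypothetical shared signal); nothing here comes from the grades dataset used later.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 4 variables that share one underlying signal
shared = rng.normal(size=(200, 1))
X = shared @ np.array([[0.9, 0.8, 0.7, 0.6]]) + 0.5 * rng.normal(size=(200, 4))

# Standardize, then eigendecompose the correlation matrix
Z = (X - X.mean(axis=0)) / X.std(axis=0)
R = np.corrcoef(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]         # re-sort to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Share of total variance carried by each component
explained_ratio = eigvals / eigvals.sum()

# Project onto the first two principal components
scores = Z @ eigvecs[:, :2]

print(explained_ratio.round(3))
print(scores.shape)
```

The eigenvalues of a correlation matrix always sum to the number of variables, so PCA redistributes all of the variance across components. Factor analysis, in contrast, sets aside a unique part for each variable and models only the shared part.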
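The main steps of a factor analysis, as followed in the rest of this post, are: check that the data are suitable (Bartlett's test of sphericity and the KMO test), choose the number of factors from the eigenvalues and a scree plot, fit the model with a factor rotation, inspect the communalities, loadings, and variance contributions, and finally convert the original data into factor scores.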
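The core Python library for factor analysis is factor_analyzer, installed with: pip install factor_analyzer. It provides functions and classes for several extraction methods (principal, minimum residual, and maximum likelihood), as well as factor rotation and factor score computation. It also offers ways to inspect the results, such as factor loadings, communalities, and factor variance. The ones used in this post are fit, get_eigenvalues, get_communalities, get_factor_variance, and transform, along with the loadings_ attribute.
The walkthrough below uses a set of student exam scores. First, import the libraries: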
# Data handling
import pandas as pd
import numpy as np
# Plotting
import seaborn as sns
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = ['SimHei'] # set the default font (supports Chinese labels)
plt.rcParams['axes.unicode_minus'] = False # keep the minus sign '-' from rendering as a box in saved figures
# Factor analysis
from factor_analyzer import FactorAnalyzer
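Then load the data. The columns are 考号 (exam ID, used as the index), 语文 (Chinese), 数学 (math), 英语 (English), 物理 (physics), 化学 (chemistry), and 生物 (biology):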
df = pd.read_excel('data/grades2.xlsx',index_col=0).iloc[:,:-3]
df = df.dropna()
df.head()
| 考号 | 语文 | 数学 | 英语 | 物理 | 化学 | 生物 |
|---|---|---|---|---|---|---|
| 532 | 117 | 142 | 136.5 | 95 | 88 | 78 |
| 430 | 120 | 136 | 136.5 | 90 | 86 | 79 |
| 322 | 121 | 142 | 131 | 88 | 78 | 74 |
| 213 | 110 | 135 | 135.5 | 91 | 86 | 85 |
| 319 | 116 | 128 | 132 | 90 | 99 | 82 |
| 407 | 120 | 125 | 140 | 83 | 77 | 82 |
| 207 | 126 | 120 | 134 | 83 | 87 | 84 |
| 126 | 108 | 141 | 127.5 | 97 | 74 | 80 |
| 144 | 123 | 129 | 123.5 | 87 | 81 | 80 |
| 524 | 120 | 127 | 137 | 86 | 72 | 70 |
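Before running a factor analysis, the data should pass an adequacy check. Bartlett's test of sphericity tests whether the population correlation matrix is an identity matrix (all diagonal elements equal to 1 and all off-diagonal elements equal to 0), that is, whether the variables are mutually uncorrelated.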
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity
chi_square_value, p_value = calculate_bartlett_sphericity(df)
chi_square_value, p_value
(638.4878993629553, 2.3317604587216965e-126)
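If the correlation matrix is not an identity matrix, the original variables are correlated and factor analysis is appropriate; otherwise the variables are uncorrelated and the data are not suitable for factor analysis. Here the p-value is far below 0.05, so the identity-matrix hypothesis is rejected.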
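For readers curious about what the test statistic measures, the sketch below computes Bartlett's chi-square statistic from scratch with NumPy, using the textbook formula chi2 = -(n - 1 - (2p + 5) / 6) * ln|R| with p(p - 1) / 2 degrees of freedom. The function name and the synthetic data are illustrative assumptions; this is not the code inside factor_analyzer, and it stops short of the p-value (which needs a chi-square survival function such as scipy.stats.chi2.sf).

```python
import numpy as np

def bartlett_sphericity_stat(X):
    """Return (chi_square, degrees_of_freedom) for Bartlett's test of sphericity."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    # log-determinant of the correlation matrix; it is 0 for an identity matrix
    _, logdet = np.linalg.slogdet(R)
    chi_square = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) // 2
    return chi_square, dof

rng = np.random.default_rng(0)

# Hypothetical data: 220 rows, 6 columns, mirroring the shape of the grades table
independent = rng.normal(size=(220, 6))                  # uncorrelated columns
shared = rng.normal(size=(220, 1))
correlated = shared + 0.7 * rng.normal(size=(220, 6))    # every column shares one signal

stat_ind, dof = bartlett_sphericity_stat(independent)
stat_cor, _ = bartlett_sphericity_stat(correlated)
print(dof, round(stat_ind, 1), round(stat_cor, 1))
```

The statistic stays small when the columns are unrelated and grows large when they share a common signal, which is why a very large value (and tiny p-value) supports going ahead with factor analysis.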
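The KMO (Kaiser-Meyer-Olkin) test compares the simple correlations and the partial correlations among the variables. The statistic ranges from 0 to 1; the closer it is to 1, the stronger the correlations are relative to the partial correlations, and the better the data suit factor analysis.
A KMO value of at least 0.6 is usually taken as the threshold for proceeding with factor analysis.
# KMO test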
from factor_analyzer.factor_analyzer import calculate_kmo
kmo_all,kmo_model=calculate_kmo(df)
kmo_model
0.8849915684323234
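The KMO value is about 0.88, well above 0.6, which again indicates that the variables are correlated and that factor analysis can proceed.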
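The overall KMO statistic can also be reproduced by hand, which shows how the "correlation versus partial correlation" comparison works. The sketch below is a from-scratch illustration on synthetic data; the function name kmo_overall is made up for this example and the code is not taken from factor_analyzer.

```python
import numpy as np

def kmo_overall(X):
    """Overall KMO: squared correlations vs. squared partial correlations."""
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)
    R_inv = np.linalg.inv(R)
    # Partial correlation of i and j given all other variables
    d = np.sqrt(np.diag(R_inv))
    partial = -R_inv / np.outer(d, d)
    off_diag = ~np.eye(R.shape[0], dtype=bool)   # mask that ignores the diagonal
    r2 = np.sum(R[off_diag] ** 2)
    p2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + p2)

rng = np.random.default_rng(0)

# Hypothetical data: one shared signal plus noise vs. pure noise
shared = rng.normal(size=(220, 1))
correlated = shared + 0.7 * rng.normal(size=(220, 6))
independent = rng.normal(size=(220, 6))

kmo_cor = kmo_overall(correlated)
kmo_ind = kmo_overall(independent)
print(round(kmo_cor, 3), round(kmo_ind, 3))
```

When the variables share a common factor, controlling for the other variables removes most of each pairwise correlation, so the partial correlations are small and KMO approaches 1.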
To choose the number of factors, compute the eigenvalues of the correlation matrix and sort them in descending order:
faa = FactorAnalyzer(25,rotation=None)
faa.fit(df)
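# ev: eigenvalues of the original correlation matrix; v: eigenvalues of the common-factor solution (not eigenvectors)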
ev,v=faa.get_eigenvalues()
ev,v
(array([3.76054785, 0.731497 , 0.44376006, 0.38913686, 0.37077393, 0.3042843 ]), array([ 3.42509954e+00, 3.41626411e-01, 8.52906289e-02, 4.38265635e-02, 3.81444552e-02, -8.44783046e-07]))
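Plot the eigenvalues against the factor number (a scree plot):
# Draw the same data as a scatter plot and as a line plot
plt.scatter(range(1, df.shape[1] + 1), ev)
plt.plot(range(1, df.shape[1] + 1), ev)
# Set the title and the x/y axis labels
# English labels are safer; Chinese text may render as garbled characters
plt.title("Scree Plot")
plt.xlabel("Factors")
plt.ylabel("Eigenvalue")
plt.grid() # show grid lines
plt.show() # display the figure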
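Alongside the scree plot, two simple numeric rules are often applied to the eigenvalues: the Kaiser criterion (keep eigenvalues greater than 1) and the cumulative share of the total. The sketch below applies both to the eigenvalues ev printed above. Neither rule is presented here as the method this post uses; note that the Kaiser criterion on its own would keep a single factor, whereas the post goes on to use two.

```python
# Eigenvalues of the correlation matrix, copied from the output above
ev = [3.76054785, 0.731497, 0.44376006, 0.38913686, 0.37077393, 0.3042843]

total = sum(ev)  # for a correlation matrix this equals the number of variables

# Kaiser criterion: count eigenvalues greater than 1
kaiser_count = sum(1 for e in ev if e > 1)

# Cumulative share of the total carried by the first k eigenvalues
cumulative = []
running = 0.0
for e in ev:
    running += e
    cumulative.append(running / total)

print("Kaiser count:", kaiser_count)
print("Cumulative share:", [round(c, 3) for c in cumulative])
```

These shares describe the eigenvalues of the full correlation matrix, so they are not the same quantity as the explained variance of a fitted factor model.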
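Here we use varimax (variance-maximizing) rotation.
# Rotation method: varimax (maximizes the variance of the squared loadings)
# Fix the number of factors at 2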
faa_two = FactorAnalyzer(2,rotation='varimax')
faa_two.fit(df)
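Other values accepted by the rotation parameter: None (no rotation), 'varimax', 'promax', 'oblimin', 'oblimax', 'quartimin', 'quartimax', 'equamax', 'geomin_obl', and 'geomin_ort'.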
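To give a feel for what a varimax rotation does, here is a compact NumPy sketch of the classic SVD-based iteration. It is an illustration under stated assumptions, not the implementation inside factor_analyzer (which may, for instance, also normalize rows before rotating). The loading matrix is hypothetical: a clean two-factor pattern deliberately mixed by a 30-degree rotation.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix to maximize the varimax criterion."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    R = np.eye(k)                      # rotation matrix, starts as identity
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ R
        # Gradient of the varimax criterion with respect to the rotation
        grad = L.T @ (rotated ** 3 - rotated * (np.sum(rotated ** 2, axis=0) / p))
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt                     # closest orthogonal matrix to the gradient
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break                      # no meaningful improvement
        criterion = new_criterion
    return L @ R, R

# Hypothetical simple structure: 3 variables per factor
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ mix               # every variable now loads on both factors

rotated, R = varimax(unrotated)
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, each variable's communality (row sum of squared loadings) is unchanged; only how that variance is split between the factors changes.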
View the communalities, the share of each variable's variance explained by the common factors:
# Communalities
faa_two.get_communalities()
array([0.51896216, 0.61041414, 0.62124855, 0.60977056, 0.6656627 , 0.68163346])
faa_two.get_eigenvalues()
View the factor loading matrix:
# Shape: number of variables x number of factors
faa_two.loadings_
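The communalities reported earlier are tied directly to this loading matrix: each one is the sum of squared loadings across a variable's row. The sketch below checks this with the loading values printed in the notebook output further down this post; the dictionary layout is just a convenient choice for the example.

```python
# Rotated loadings copied from the notebook output in this post
loadings = {
    "语文": (0.65864126, 0.29181132),
    "数学": (0.39811093, 0.67225131),
    "英语": (0.73327534, 0.28906024),
    "物理": (0.25112045, 0.73939778),
    "化学": (0.59534374, 0.55787859),
    "生物": (0.60115, 0.56590824),
}

# Communality = sum of squared loadings in each row
communalities = {name: sum(l ** 2 for l in row) for name, row in loadings.items()}

# Uniqueness = variance not explained by the common factors (standardized variables)
uniqueness = {name: 1 - h2 for name, h2 in communalities.items()}

for name in loadings:
    print(name, round(communalities[name], 6), round(uniqueness[name], 6))
```

The values agree with the get_communalities() output, for example about 0.519 for 语文 and about 0.682 for 生物.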
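Each factor contributes some amount to explaining the variables. get_factor_variance() returns three related measures: the variance of each factor (its sum of squared loadings), the proportion of the total variance it explains, and the cumulative proportion: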
faa_two.get_factor_variance()
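These three measures can be recomputed by hand from the loading matrix, which makes their meaning explicit. The sketch below uses the loading values printed in the notebook output in this post and plain Python; it illustrates the definitions rather than the library's internal code.

```python
# Rotated loadings (rows: 语文, 数学, 英语, 物理, 化学, 生物), copied from this post
loadings = [
    [0.65864126, 0.29181132],
    [0.39811093, 0.67225131],
    [0.73327534, 0.28906024],
    [0.25112045, 0.73939778],
    [0.59534374, 0.55787859],
    [0.60115, 0.56590824],
]
n_vars = len(loadings)
n_factors = len(loadings[0])

# 1) Variance of each factor: sum of squared loadings down each column
variance = [sum(row[j] ** 2 for row in loadings) for j in range(n_factors)]

# 2) Proportion of total variance: divide by the number of (standardized) variables
proportion = [v / n_vars for v in variance]

# 3) Cumulative proportion
cumulative = [sum(proportion[: j + 1]) for j in range(n_factors)]

print([round(v, 6) for v in variance])
print([round(p, 6) for p in proportion])
print([round(c, 6) for c in cumulative])
```

The two factors together account for roughly 62% of the total variance of the six standardized scores, matching the last array returned by get_factor_variance().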
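To see more directly which variables each latent factor is most strongly associated with, visualize the loadings as a heatmap. For convenience, take the absolute values of the loadings: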
df1 = pd.DataFrame(np.abs(faa_two.loadings_),index=df.columns)
# Draw the heatmap
ax = sns.heatmap(df1, annot=True, cmap="BuPu")
# Set the y-axis tick label size
ax.yaxis.set_tick_params(labelsize=15)
plt.title("Factor Analysis", fontsize="xx-large")
# Set the y-axis label (the rows are the original variables)
plt.ylabel("Variables", fontsize="xx-large")
# Display the figure
plt.show()
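The same reading of the heatmap can be done numerically by assigning each variable to the factor on which it has the largest absolute loading. The sketch below does this with the loading values printed in the notebook output in this post. This grouping rule is a common heuristic added here for illustration, not a step the post itself performs.

```python
names = ["语文", "数学", "英语", "物理", "化学", "生物"]
loadings = [
    [0.65864126, 0.29181132],
    [0.39811093, 0.67225131],
    [0.73327534, 0.28906024],
    [0.25112045, 0.73939778],
    [0.59534374, 0.55787859],
    [0.60115, 0.56590824],
]

groups = {}   # factor index -> variables whose largest |loading| is on that factor
gaps = {}     # variable -> difference between its two absolute loadings
for name, row in zip(names, loadings):
    dominant = max(range(len(row)), key=lambda j: abs(row[j]))
    groups.setdefault(dominant, []).append(name)
    gaps[name] = abs(abs(row[0]) - abs(row[1]))

print(groups)
print({name: round(g, 3) for name, g in gaps.items()})
```

化学 and 生物 load almost equally on both factors (the gap is under 0.04), so their assignment to factor 0 is weak and should be interpreted with caution.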
Having settled on 2 factors above, we can convert the original data into 2 new features (factor scores) as follows:
df2 = pd.DataFrame(faa_two.transform(df))
| | 0 | 1 |
|---|---|---|
| 0 | 1.08945 | 1.44028 |
| 1 | 1.28298 | 1.12287 |
| 2 | 0.981882 | 1.05373 |
| 3 | 1.01176 | 1.35829 |
| 4 | 1.2519 | 1.27035 |
| 5 | 1.47917 | 0.607756 |
| 6 | 1.6684 | 0.657462 |
| 7 | 0.350665 | 1.61349 |
| 8 | 1.01576 | 0.998846 |
| 9 | 1.06597 | 0.576816 |
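One common way to obtain factor scores like these is the regression (Thurstone) method: standardize the data, compute weights W = R^{-1} Λ from the correlation matrix R and the loadings Λ, and multiply. The sketch below illustrates it on synthetic data with known, hypothetical loadings. Whether factor_analyzer's transform performs exactly this computation is an assumption, not something stated in this post.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-factor structure: 3 variables per factor
true_loadings = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
                          [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
n = 500
factors = rng.normal(size=(n, 2))                        # latent factors (unknown in practice)
unique_sd = np.sqrt(1 - (true_loadings ** 2).sum(axis=1))
X = factors @ true_loadings.T + rng.normal(size=(n, 6)) * unique_sd

# Regression-method factor scores
Z = (X - X.mean(axis=0)) / X.std(axis=0)                 # standardize each column
R = np.corrcoef(Z, rowvar=False)                         # sample correlation matrix
weights = np.linalg.solve(R, true_loadings)              # W = R^{-1} Lambda
scores = Z @ weights                                     # one row of scores per sample

print(scores.shape)
print(np.round(np.corrcoef(scores[:, 0], factors[:, 0])[0, 1], 3))
```

In a real analysis the loadings come from the fitted model rather than being known in advance, and the scores are estimates of the latent factors, not exact recoveries.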
Reference material: the complete code and output
# Data handling
import pandas as pd
import numpy as np
# Plotting
import seaborn as sns
import matplotlib.pyplot as plt
plt.rcParams['font.sans-serif'] = ['SimHei'] # set the default font (supports Chinese labels)
plt.rcParams['axes.unicode_minus'] = False # keep the minus sign '-' from rendering as a box in saved figures
# Factor analysis
from factor_analyzer import FactorAnalyzer
df = pd.read_excel('data/grades2.xlsx',index_col=0).iloc[:,:-3]
df = df.dropna()
df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 220 entries, 193000532 to 194000234
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   语文      220 non-null    int64
 1   数学      220 non-null    int64
 2   英语      220 non-null    float64
 3   物理      220 non-null    float64
 4   化学      220 non-null    float64
 5   生物      220 non-null    float64
dtypes: float64(4), int64(2)
memory usage: 12.0 KB
from factor_analyzer.factor_analyzer import calculate_bartlett_sphericity
chi_square_value, p_value = calculate_bartlett_sphericity(df)
chi_square_value, p_value
(632.5790053159449, 4.213290629381875e-125)
# KMO test
from factor_analyzer.factor_analyzer import calculate_kmo
kmo_all,kmo_model=calculate_kmo(df)
kmo_model
0.8859157158309541
faa = FactorAnalyzer(25,rotation=None)
faa.fit(df)
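# ev: eigenvalues of the original correlation matrix; v: eigenvalues of the common-factor solution (not eigenvectors)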
ev,v=faa.get_eigenvalues()
ev,v
(array([3.76054785, 0.731497 , 0.44376006, 0.38913686, 0.37077393,
0.3042843 ]),
array([ 3.42509954e+00, 3.41626411e-01, 8.52906289e-02, 4.38265635e-02,
3.81444552e-02, -8.44783046e-07]))
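# Draw the same data as a scatter plot and as a line plot
plt.scatter(range(1, df.shape[1] + 1), ev)
plt.plot(range(1, df.shape[1] + 1), ev)
# Set the title and the x/y axis labels
# English labels are safer; Chinese text may render as garbled characters
plt.title("Scree Plot")
plt.xlabel("Factors")
plt.ylabel("Eigenvalue")
plt.grid() # show grid lines
plt.show() # display the figure
# Rotation method: varimax (maximizes the variance of the squared loadings)
# Fix the number of factors at 2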
faa_two = FactorAnalyzer(2,rotation='varimax')
faa_two.fit(df)
FactorAnalyzer(n_factors=2, rotation='varimax', rotation_kwargs={})
# Communalities
faa_two.get_communalities()
array([0.51896216, 0.61041414, 0.62124855, 0.60977056, 0.6656627 ,
0.68163346])
pd.DataFrame(faa_two.get_communalities(),index=df.columns)
| | 0 |
|---|---|
| 语文 | 0.518962 |
| 数学 | 0.610414 |
| 英语 | 0.621249 |
| 物理 | 0.609771 |
| 化学 | 0.665663 |
| 生物 | 0.681633 |
faa_two.get_eigenvalues()
(array([3.76054785, 0.731497 , 0.44376006, 0.38913686, 0.37077393,
0.3042843 ]),
array([ 3.38462778, 0.32306269, 0.03554346, 0.01132939, -0.01696077,
-0.02991098]))
pd.DataFrame(faa_two.get_eigenvalues())
| | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0 | 3.760548 | 0.731497 | 0.443760 | 0.389137 | 0.370774 | 0.304284 |
| 1 | 3.384628 | 0.323063 | 0.035543 | 0.011329 | -0.016961 | -0.029911 |
# Shape: number of variables x number of factors
faa_two.loadings_
array([[0.65864126, 0.29181132],
[0.39811093, 0.67225131],
[0.73327534, 0.28906024],
[0.25112045, 0.73939778],
[0.59534374, 0.55787859],
[0.60115 , 0.56590824]])
pd.DataFrame(faa_two.loadings_,index=df.columns)
| | 0 | 1 |
|---|---|---|
| 语文 | 0.658641 | 0.291811 |
| 数学 | 0.398111 | 0.672251 |
| 英语 | 0.733275 | 0.289060 |
| 物理 | 0.251120 | 0.739398 |
| 化学 | 0.595344 | 0.557879 |
| 生物 | 0.601150 | 0.565908 |
faa_two.get_factor_variance()
(array([1.90887032, 1.79882124]), array([0.31814505, 0.29980354]), array([0.31814505, 0.61794859]))
df1 = pd.DataFrame(np.abs(faa_two.loadings_),index=df.columns)
df1
| | 0 | 1 |
|---|---|---|
| 语文 | 0.658641 | 0.291811 |
| 数学 | 0.398111 | 0.672251 |
| 英语 | 0.733275 | 0.289060 |
| 物理 | 0.251120 | 0.739398 |
| 化学 | 0.595344 | 0.557879 |
| 生物 | 0.601150 | 0.565908 |
# Draw the heatmap
ax = sns.heatmap(df1, annot=True, cmap="BuPu")
# Set the y-axis tick label size
ax.yaxis.set_tick_params(labelsize=15)
plt.title("Factor Analysis", fontsize="xx-large")
# Set the y-axis label (the rows are the original variables)
plt.ylabel("Variables", fontsize="xx-large")
# Display the figure
plt.show()
# Save the figure
# plt.savefig("factorAnalysis", dpi=500)
faa_two.transform(df)
array([[ 1.08945288e+00, 1.44027576e+00],
[ 1.28297922e+00, 1.12286513e+00],
[ 9.81882460e-01, 1.05373341e+00],
[ 1.01176271e+00, 1.35828922e+00],
[ 1.25190314e+00, 1.27035360e+00],
[ 1.47916949e+00, 6.07756401e-01],
[ 1.66840158e+00, 6.57461515e-01],
[ 3.50664941e-01, 1.61349238e+00],
[ 1.01576388e+00, 9.98845598e-01],
[ 1.06596648e+00, 5.76816242e-01],
[ 5.36797196e-01, 1.28993995e+00],
[ 4.84084571e-01, 1.49562759e+00],
[ 8.11063369e-01, 1.45396597e+00],
[ 6.09444272e-01, 1.34992274e+00],
[ 9.35975375e-01, 1.03592247e+00],
[ 3.97875915e-01, 1.52677274e+00],
[ 1.02033764e-01, 1.26303534e+00],
[ 4.70500271e-01, 7.28530656e-01],
[ 5.77062853e-01, 8.67012058e-01],
[ 7.79018711e-01, 6.82649258e-01],
[ 1.13513851e+00, 2.62999531e-01],
[ 8.67865100e-01, 8.53702022e-01],
[ 2.77636282e-01, 1.14087853e+00],
[ 8.04711698e-01, 6.46230442e-01],
[ 3.22199954e-01, 9.58588086e-01],
[ 1.09934858e+00, 4.92246364e-01],
[ 8.02229103e-01, 5.24663924e-01],
[ 5.47052898e-01, 6.83590517e-01],
[ 5.96601161e-01, 1.05840146e+00],
[ 1.05737998e+00, 3.22028895e-01],
[ 7.60952794e-01, 7.34528011e-01],
[ 7.96197980e-01, 8.43817068e-01],
[ 4.59821124e-01, 9.56094908e-01],
[ 5.39496957e-01, 1.14297920e+00],
[ 5.06955178e-01, 7.33981341e-01],
[ 5.57879357e-01, 9.73203187e-01],
[ 2.01498866e-01, 1.47570694e+00],
[ 1.41024637e+00, -6.81422737e-02],
[ 1.57845360e-01, 9.31698295e-01],
[ 1.02212769e+00, 2.23482876e-01],
[ 3.84658282e-01, 9.18249482e-01],
[ 3.86757895e-01, 1.15781102e+00],
[ 3.56392546e-01, 4.28505292e-01],
[ 3.67185849e-01, 6.19167467e-01],
[ 7.86181224e-01, 3.99844663e-01],
[ 2.91057407e-01, 1.35638983e+00],
[ 1.81963885e-01, 9.66175064e-01],
[ 7.54361327e-01, 3.33111071e-01],
[ 3.46307744e-01, 3.59724734e-01],
[ 6.46661619e-02, 8.27530260e-01],
[ 6.82493542e-01, 4.32297755e-01],
[ 4.74298726e-01, 2.87958915e-01],
[ 4.86911784e-01, 5.96514706e-01],
[ 3.33101638e-01, 7.53490197e-01],
[ 9.18960293e-01, 1.81876681e-01],
[ 1.21984743e+00, -1.04349825e-01],
[ 3.23968577e-03, 9.48425761e-01],
[ 8.22300003e-01, -2.70624478e-01],
[ 9.08032179e-01, -1.02996743e-01],
[ 7.45635478e-01, 2.30050875e-01],
[ 6.80522409e-01, 1.89020431e-01],
[ 2.81230271e-01, 4.87765169e-01],
[ 5.15765784e-01, -9.71952702e-02],
[ 1.30970476e+00, -5.25842932e-01],
[-1.82966905e-01, 9.60816332e-01],
[-4.19610304e-01, 1.05055704e+00],
[ 6.75011791e-01, 1.04804946e-01],
[ 1.81540705e-01, 6.71920379e-01],
[ 1.23909748e-01, 6.63058816e-01],
[ 4.42504142e-01, 9.67412488e-02],
[ 7.95226243e-01, 2.25664120e-01],
[ 1.86594827e-01, 3.88532700e-01],
[-1.81286500e-02, 7.67349376e-01],
[ 3.16222561e-01, 4.33645897e-01],
[ 8.72776502e-02, 6.37502036e-01],
[ 2.92212126e-01, 8.14845482e-02],
[ 1.02040436e+00, -3.47378564e-01],
[ 5.44072008e-01, 4.37151841e-02],
[ 5.89558361e-03, 9.38902876e-01],
[ 7.65447289e-01, -6.72307239e-02],
[ 1.21686371e-01, 3.09442259e-01],
[ 5.50788712e-01, 1.13514140e-02],
[-1.05734489e-01, 7.86139372e-01],
[-1.43049264e-02, 5.92302702e-01],
[ 2.79084247e-02, 4.16015124e-01],
[ 8.15564465e-01, -1.99412626e-01],
[ 1.31569213e-01, 1.40244815e-01],
[ 9.20704225e-01, -3.80445762e-01],
[ 4.40248749e-01, -1.98151339e-01],
[ 5.23850857e-01, -6.34180656e-02],
[-5.61646333e-01, 6.36711919e-01],
[-1.83773513e-01, 3.67107594e-01],
[ 4.32595080e-01, 5.64379657e-02],
[ 8.86763007e-01, -5.41594179e-01],
[ 1.87022335e-01, 2.97612057e-01],
[ 1.41942230e-01, 2.34944855e-01],
[ 6.11405834e-01, -4.50095518e-01],
[ 6.12215380e-01, -1.96476472e-01],
[ 8.68370130e-02, 5.51587875e-01],
[-3.96307873e-01, 6.46831099e-01],
[ 5.46511168e-03, 3.00081990e-01],
[ 4.88977111e-01, -2.28688455e-01],
[ 3.13514245e-01, 1.93587267e-01],
[ 3.30980014e-04, 2.03194580e-01],
[ 2.32800318e-01, -2.83249580e-01],
[ 1.61716180e-01, 8.90899435e-02],
[ 7.75371480e-01, -5.30645945e-01],
[ 7.79187469e-01, -3.97609867e-01],
[ 8.04249266e-02, -8.93352860e-02],
[-4.84108109e-01, 7.25667270e-01],
[ 3.95388348e-01, -4.31324999e-01],
[-1.16082466e-01, 2.14126357e-01],
[-3.03499713e-01, 7.75003561e-01],
[-5.01195087e-02, 3.81268713e-01],
[ 4.03685243e-01, -2.13721758e-01],
[ 2.64961602e-01, -1.71608403e-01],
[-7.04922073e-02, 1.81238123e-01],
[-6.20375569e-01, 4.55609695e-01],
[-3.20149447e-01, 3.95437467e-03],
[ 1.43411626e-01, -1.03345446e-01],
[ 1.48032977e-02, -1.90706825e-01],
[ 7.31605028e-01, -8.78329422e-01],
[-6.07741541e-01, 4.36130072e-01],
[ 5.58257063e-01, -6.01666377e-01],
[-4.36842296e-01, 1.94846006e-01],
[ 8.88461248e-02, -5.37638291e-02],
[ 1.96657329e-01, -2.87602191e-01],
[-5.90481373e-01, 5.86716522e-01],
[ 1.73692291e-01, -2.77550962e-01],
[-4.07077829e-02, -1.72326571e-01],
[-8.45968189e-01, 7.26501452e-01],
[ 2.95498717e-01, -3.44832485e-01],
[-2.07698911e-01, -6.02511691e-02],
[-3.24190293e-01, 5.25217801e-02],
[-1.74586837e-01, 4.17487327e-02],
[-3.06753990e-01, 2.75923386e-01],
[-3.11329292e-02, -5.85604803e-01],
[-3.05719425e-01, 2.00725438e-02],
[ 5.52429258e-01, -8.26227463e-01],
[-6.15007730e-01, 3.53045698e-01],
[-7.66624145e-01, 3.45676001e-01],
[ 5.42992113e-01, -5.81730575e-01],
[-7.72153357e-01, 4.75750898e-01],
[-3.72823687e-01, -1.75835572e-01],
[-2.83853225e-01, -4.78333327e-02],
[-1.56858817e-01, -2.48266945e-01],
[ 1.08585401e+00, -1.04754140e+00],
[ 3.05944717e-01, -8.29895176e-01],
[-4.18390962e-01, 2.02882190e-01],
[-6.33145326e-01, -1.85367459e-01],
[-3.75934175e-01, 7.26872808e-02],
[ 3.77210777e-01, -9.77793280e-01],
[-4.57450179e-01, -4.11900142e-01],
[ 1.04787235e+00, -1.16379664e+00],
[-6.50070950e-01, -1.37222765e-01],
[-1.20461400e+00, 7.82385277e-01],
[ 4.74212006e-01, -1.00804931e+00],
[ 4.50439966e-01, -9.53036747e-01],
[ 1.70909328e-01, -5.88153957e-01],
[ 8.84359203e-02, -6.75590641e-01],
[-5.33396463e-01, -1.54870424e-01],
[-7.70504119e-01, 1.56283342e-01],
[-8.29763955e-01, -9.43885382e-02],
[ 2.05686546e-01, -9.13999466e-01],
[-1.57698444e-01, -7.81908328e-01],
[-2.87913442e-01, -2.45235066e-01],
[-6.61050766e-01, -1.52397559e-01],
[-8.67863182e-01, 3.17610975e-01],
[-3.94570788e-01, -4.27681578e-01],
[ 1.67014084e-01, -8.56350486e-01],
[-2.91289043e-01, -6.75498298e-01],
[-1.54562666e+00, 2.61804526e-01],
[-1.17375048e-01, -8.12557402e-01],
[-8.24700642e-01, -1.37357326e-02],
[-5.63717829e-01, -5.00345226e-01],
[-5.02152358e-01, -7.76532200e-01],
[-8.54368367e-01, -1.44985216e-01],
[-7.43046113e-02, -8.87838183e-01],
[-7.31783614e-01, -4.31137698e-01],
[-1.73919338e+00, 5.35704536e-01],
[ 2.66171415e-01, -1.32131499e+00],
[-8.36214407e-01, -3.32528237e-01],
[-1.62764833e+00, 3.67177498e-01],
[-3.27126509e-02, -1.30015113e+00],
[ 1.32196329e-01, -1.45573794e+00],
[-8.56977665e-02, -1.09267630e+00],
[-1.34473756e+00, -5.36312813e-02],
[-1.36546353e+00, -1.15226393e-01],
[-5.69842901e-01, -8.65446844e-01],
[-1.32056030e+00, -4.76200471e-01],
[-1.16695858e+00, -4.74742689e-01],
[-3.73672213e-01, -1.15912619e+00],
[-1.43479044e+00, -1.99395045e-01],
[ 9.58747167e-01, -2.02481243e+00],
[-2.70072195e+00, 7.82311702e-01],
[ 4.85340642e-02, -1.53910942e+00],
[-1.43229769e+00, -4.19949471e-01],
[-5.48326557e-01, -1.20753417e+00],
[-3.32437154e-01, -1.46188805e+00],
[ 6.70380338e-02, -1.75660291e+00],
[-1.98618686e+00, -1.54890004e-01],
[-6.81063471e-01, -1.19305355e+00],
[-1.99105214e+00, -3.24282287e-01],
[-9.83896084e-01, -1.20481914e+00],
[-7.75514689e-01, -1.54221088e+00],
[-8.82189948e-01, -1.38118693e+00],
[-1.14263641e+00, -1.26461966e+00],
[ 2.84211550e-01, -2.20135510e+00],
[-1.17726074e+00, -1.33621935e+00],
[-5.60248302e-01, -1.68815337e+00],
[-1.36979396e+00, -1.17028755e+00],
[-7.82407954e-01, -1.81841054e+00],
[-1.52625703e+00, -1.10594420e+00],
[-2.19186564e+00, -7.21403813e-01],
[-2.28159392e+00, -7.17850760e-01],
[-1.84679534e+00, -9.99994082e-01],
[-1.09034062e+00, -1.82023686e+00],
[-1.90534313e+00, -1.38780227e+00],
[-9.27873349e-01, -2.50573291e+00],
[-3.93675491e+00, -2.41027667e+00]])
df2 = pd.DataFrame(faa_two.transform(df))
df2
| | 0 | 1 |
|---|---|---|
| 0 | 1.089453 | 1.440276 |
| 1 | 1.282979 | 1.122865 |
| 2 | 0.981882 | 1.053733 |
| 3 | 1.011763 | 1.358289 |
| 4 | 1.251903 | 1.270354 |
| ... | ... | ... |
| 215 | -1.846795 | -0.999994 |
| 216 | -1.090341 | -1.820237 |
| 217 | -1.905343 | -1.387802 |
| 218 | -0.927873 | -2.505733 |
| 219 | -3.936755 | -2.410277 |
220 rows × 2 columns
print(df2.head(10).to_markdown())
| | 0 | 1 |
|---:|---------:|---------:|
| 0 | 1.08945 | 1.44028 |
| 1 | 1.28298 | 1.12287 |
| 2 | 0.981882 | 1.05373 |
| 3 | 1.01176 | 1.35829 |
| 4 | 1.2519 | 1.27035 |
| 5 | 1.47917 | 0.607756 |
| 6 | 1.6684 | 0.657462 |
| 7 | 0.350665 | 1.61349 |
| 8 | 1.01576 | 0.998846 |
| 9 | 1.06597 | 0.576816 |